Frequency distribution of cytokine and associated transcription factor single nucleotide polymorphisms in Zimbabweans: Impact on schistosome infection and cytokine levels


Abstract
Cytokines mediate T-helper (TH) responses that are crucial for determining the course of infection and disease. The expression of cytokines is regulated by transcription factors (TFs). Here we present the frequencies of single nucleotide polymorphisms (SNPs) in cytokine and TF genes in a Zimbabwean population, and further relate SNPs to susceptibility to schistosomiasis and cytokine levels. Individuals (N = 850) were genotyped for SNPs across the cytokines IL4, IL10, IL13, IL33, and IFNG, and their TFs STAT4, STAT5A/B, STAT6, GATA3, FOXP3, and TBX21 to determine allele frequencies. Circulatory levels of systemic and parasite-specific IL-4, IL-5, IL-10, IL-13, and IFNγ were quantified via enzyme-linked immunosorbent assay. Schistosoma haematobium infection was determined by enumerating parasite eggs excreted in urine by microscopy. SNP allele frequencies were related to infection status by case-control analysis and logistic regression, and to egg burdens and systemic and parasite-specific cytokine levels by analysis of variance and linear regression. Novel findings were i) IL4 rs2070874*T's association with protection from schistosomiasis, as carriage of ≥1 allele gave an odds ratio of infection of 0.597 (95% CIs, 0.421-0.848, p = 0.0021) and ii) IFNG rs2069727*G's association with susceptibility to schistosomiasis, as carriage of ≥1 allele gave an odds ratio of infection of 1.692 (95% CIs, 1.229-2.33, p = 0.0013). Neither IL4 rs2070874*T nor IFNG rs2069727*G was significantly associated with cytokine levels. This study found TH2-upregulating SNPs were more frequent among the Zimbabwean sample compared to African and European populations, highlighting the value of immunogenetic studies of African populations in the context of infectious diseases and other conditions, including allergic and atopic disease. In addition, the identification of novel infection-associated alleles in both TH1- and TH2-associated genes highlights the role of both in regulating and controlling responses to Schistosoma.

Introduction
The biological effects of the immune system are partly mediated by the expression of cytokines, which have a strong influence on the type and strength of immune responses. Although these cytokines can be produced by an array of immune cells, their predominant source is T helper (TH) cells, particularly CD4+ TH cells. These cells have been largely divided into TH1, TH2, TH17 and T-regulatory (Treg) cell types, each characterised by the cytokines they produce. TH1 CD4+ T cells produce the key TH1 cytokines interleukin-2 (IL-2), interferon-γ (IFNγ) and tumor necrosis factor-α, and this response is widely implicated in bacterial and viral infections, while TH2 CD4+ T cells produce the key TH2 cytokines IL-4, IL-5 and IL-13, the key response in parasite infections and allergic reactions [1]. TH17 CD4+ T cells produce IL-17, while Treg cells produce the regulatory and anti-inflammatory cytokine IL-10 [1]. The balance between these cytokines determines the phenotype of an immune response, subsequent immunopathology, and the eventual clearance or persistence of infection. The production of these cytokines is partly under genetic control via transcription factors (TFs), and changes in individual nucleotides coding for these cytokines or TFs (single nucleotide polymorphisms, SNPs) can significantly alter cytokines and their expression and therefore the nature of the immune response.
Both TH1 and TH2 responses have been implicated in helminth infections in humans, as the balance between these two immune responses has been found to control the development of immunopathological responses [2,3]. Human studies and mouse models have found that the development of fibrotic and granulomatous responses following Schistosoma infection is primarily driven by TH2 cytokines, with particular emphasis on the roles of IL-5 and IL-13 in driving immunopathology in response to parasite egg deposition [4,5]. Early mouse studies of S. mansoni infection demonstrated that blockade of the TH2 response, either through exogenous administration of IL-12 or knockout of IL-4Rα, inhibits the development of granulomatous and fibrotic responses to schistosomes [6,7]. Conversely, TH1 cytokines have been shown to limit immunopathology through a negative feedback loop with TH2 responses; for example, high IFNγ production has been found to correlate with reductions in liver fibrosis in mice [5,8,9]. However, studies of S. mansoni infections in mice lacking the IFNγ receptor found reductions in granuloma size and hastened progression to chronic immune responses to parasites [10]. Therefore, the dynamic between TH1 and TH2 cytokines in schistosome infection seems critical to directing the typical immunological response, and disruption to either arm may result in abnormal immune responses to infection. In addition to mediating the immune response to Schistosoma, our studies have shown that the balance between TH1 and TH2 responses is of critical importance in the development of protective immunity against schistosomiasis [3,11,12]. Examining how this balance is regulated at the genetic level is highly relevant to understanding how responses to, susceptibility to, and resistance to schistosomiasis are biologically mediated.
Host genetics have been implicated in susceptibility to human schistosome infection, with genes localised on chromosome 5 in the region 5q31-q33 having been shown to have key roles [13]. This region carries genes encoding the TH2 cytokines IL-4, IL-5, and IL-13 [14]. Genetic studies have revealed associations between SNPs in genes encoding cytokines and TFs and susceptibility to schistosomiasis, including in STAT6, IL4, IL5, IL10, and IL13 [15-21]. Recently, Choto and colleagues identified an association between IL13 rs1800925 and elevated IL-13 concentrations in schistosome-uninfected but not schistosome-infected individuals in Zimbabwe [22]. In addition, Marume and colleagues identified an association between the IL10 SNP rs1800871, protection from S. haematobium infection and lower IL-10 production [23]. Genetic variants within cytokine and TF genes that are associated with schistosomiasis are often also associated with perturbation of the TH1/TH2 balance and the expression of cytokines involved in responding to infection. Nonetheless, to date there have been no comprehensive studies documenting the frequency of cytokine- and TF-associated SNPs and their relationship to helminth infection and cytokine levels in an African population, and there is a paucity of genetic research focussing on individuals of African ancestry and neglected tropical diseases [24]. Thus, the aim of this study was to genotype SNPs in cytokine markers of T-helper responses and associated TFs and to relate these to levels of S. haematobium infection and corresponding cytokines. In doing so, we identified a novel protective allele in the IL4 gene (rs2070874*T) and a novel risk allele in the IFNG gene (rs2069727*G).

Ethics statement
Ethical approval was granted by the Medical Research Council of Zimbabwe (MRCZ/A/1408) and the University of Zimbabwe Institutional Review Board. The Provincial Medical Director granted local permission. Community members were informed of the study aims and procedures in their local language (Shona), and compliant participants provided written consent, or assent from a parent/guardian if aged under 18 years.

Study area and participants
This work is part of a larger study characterising the nature and development of schistosome-specific immunity in human populations, with field work conducted from 2008 to 2010. Participants were recruited from two villages (Magaya and Chipinda) in Murewa District (17°38′49″S 31°46′39″E), Mashonaland East Province, Zimbabwe, where S. haematobium is endemic. The eligibility criteria for this study were as follows: participants had to i) be life-long residents of the area, ii) provide a minimum of two urine and two stool samples on consecutive days, and iii) be negative for S. mansoni, soil-transmitted helminthiases (STH), malaria and human immunodeficiency virus (HIV). Following the application of inclusion criteria, 850 individuals were recruited to participate in this study. Subsequently, 23 individuals were excluded from parasitological analyses due to missing or incomplete S. haematobium egg counts, though they were included in calculations of genotype and allele frequencies within the study population. Participants ranged in age from 3 to 86 years, with a median age of 12 years. Participants were 43.88% male and 56.12% female.

Parasitology and sample collection
S. haematobium, S. mansoni and STH eggs were quantified from a minimum of two urine and stool samples, as previously described [3]. Mean S. haematobium infection intensity was determined by urine microscopy from at least two urine samples provided on consecutive days. While more sensitive tests exist for the detection of lower intensity and prepatent Schistosoma infections, such as nucleic acid-based tests or immunological assays, the cost associated with these, given the sample size and field location, was prohibitive [25,26]. A 5 ml sample of venous blood was collected, from which a drop was used for blood smear microscopic detection of Plasmodium spp. and 1 ml was stored for genotyping studies. The rest of the blood was processed as previously described to extract serum for quantifying cytokine levels and for malaria and HIV serodiagnosis [3]. Malaria status was confirmed using Paracheck rapid tests (Orchid Biomedical Systems) and HIV was detected by the DoubleCheckGold HIV1&2 Whole Blood test (Orgenics), with positive cases confirmed using Determine HIV1/2 Ag/Ab Combo (Inverness Medical).

Genotyping
We selected signature cytokines and their associated TFs and conducted a literature search to identify published SNPs in the following genes: IL4, IL10, IL13, IL33, IFNG, STAT4, STAT5A, STAT5B, STAT6, GATA3, FOXP3, and TBX21. SNPs were excluded if they had previously been reported to have a minor allele frequency of <0.1 in the Yoruba (West African) population, as low allele frequencies would leave insufficient statistical power to detect effects of rare alleles. Of the remaining SNPs, those with a recorded association with allergy, asthma, or altered immune system function, including effects on cytokine or antibody production, were studied further. This resulted in 35 SNPs being selected for this study. SNPs are referred to throughout using the SNP ID registered on the National Center for Biotechnology Information's (NCBI) SNP database. Genomic DNA was extracted from blood samples and subjected to targeted genotyping by sequencing of the 35 SNPs, performed by LGC Genomics (Hoddesdon, UK).

Serology
Both systemic and parasite-specific (antigen-stimulated) IL-4, IL-5, IL-13, IL-10, and IFNγ concentrations were measured by enzyme-linked immunosorbent assay (ELISA). A random subgroup of participants (N = 233) resident in Magaya was selected for serological studies; this subgroup was 48.91% male and 51.09% female, with a median age of 12 years and an age range of 3 to 80 years. Both infected and uninfected individuals were included in this analysis. Systemic cytokine levels were determined in duplicate in sera by capture ELISA, as previously described [11]. Parasite-specific cytokines were also measured in duplicate by ELISA from supernatants collected from whole blood cultures stimulated for 48 hours at 37°C with S. haematobium soluble egg antigen (SEA) (N = 233), cercarial antigen preparation (CAP) (N = 67), or whole worm homogenate (WWH) (N = 233), as previously described [3]. Briefly, sera (systemic cytokines) or blood culture supernatants (parasite-specific cytokines) were added in duplicate to 96-well plates coated with 1 μg/ml capture antibody for IL-4, IL-5, IL-13, IL-10 or IFNγ (BD Biosciences) and incubated overnight at 4°C. Subsequently, 0.5 μg/ml (IFNγ only) or 1 μg/ml biotinylated detection antibody was added for two hours at 37°C, followed by streptavidin-horseradish peroxidase for two hours at 37°C. Lastly, 3,3',5,5'-tetramethylbenzidine substrate was added and developed for five minutes. Samples were then analysed by spectrophotometry at 450 nm and compared to a standard curve for each cytokine for quantification.
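The quantification step (duplicate readings compared against a standard curve) can be illustrated with a minimal sketch. The source does not state which curve-fitting model was used, so the piecewise-linear interpolation, the function names, the units, and all numeric values below are assumptions made purely for illustration.

```python
def mean_duplicate(od_a, od_b):
    """Average the two optical density (OD) readings of a duplicate pair."""
    return (od_a + od_b) / 2.0


def interpolate_concentration(od, standards):
    """Linearly interpolate a concentration from (concentration, OD) standards.

    Returns None when the OD lies outside the range covered by the standards.
    """
    points = sorted(standards, key=lambda s: s[1])  # order standards by OD
    for (c0, od0), (c1, od1) in zip(points, points[1:]):
        if od0 <= od <= od1:
            if od1 == od0:
                return c0
            fraction = (od - od0) / (od1 - od0)
            return c0 + fraction * (c1 - c0)
    return None


# Hypothetical standard curve: (concentration in pg/ml, OD at 450 nm).
standards = [(0.0, 0.05), (100.0, 0.55), (200.0, 1.05)]
sample_od = mean_duplicate(0.28, 0.32)  # hypothetical duplicate readings
concentration = interpolate_concentration(sample_od, standards)
```

Readings outside the range of the standards return `None` rather than an extrapolated value, which is a design choice of this sketch and not a statement about the study's handling of such samples.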

Statistical analysis
All statistical analyses were performed using SPSS Statistics Version 25 unless otherwise stated. Infection intensities (egg counts, measured as eggs/10 ml urine) and cytokine concentrations were log10(x+1) and square-root(x+1) transformed, respectively, in statistical analyses to meet the assumptions of parametric analysis. All figures were produced in GraphPad Prism 8 Version 9.1.0 for Windows (GraphPad Software, San Diego, California USA, www.graphpad.com), unless otherwise stated. 95% confidence intervals (CIs) were calculated for frequencies and proportions using exact binomial tests. Individuals included in analyses were not case-matched, but potential confounders were accounted for in statistical models. These included participant village, age, and sex when analysing infection status/intensity, and participant infection intensity, age, and sex when analysing cytokine concentrations. Controlling for village when analysing cytokine levels was not necessary as cytokine quantification was performed only on individuals residing in Magaya.
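The two transformations and the exact binomial interval can be sketched as follows. The analyses were run in SPSS, so this is not the authors' code. It assumes the exact interval is the Clopper-Pearson interval (the usual meaning of an "exact binomial" CI), and the function names and example counts are hypothetical.

```python
import math


def log10_plus1(x):
    """log10(x + 1) transform, as applied to egg counts."""
    return math.log10(x + 1)


def sqrt_plus1(x):
    """sqrt(x + 1) transform, as applied to cytokine concentrations."""
    return math.sqrt(x + 1)


def binom_pmf(k, n, p):
    """Binomial probability mass, computed on the log scale for stability."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def _bisect(func, lo=0.0, hi=1.0, iters=60):
    """Root of an increasing function on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    lower = 0.0 if k == 0 else _bisect(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    upper = 1.0 if k == n else _bisect(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


# Illustrative counts only: 354 positives out of 796 individuals.
ci_low, ci_high = exact_binomial_ci(354, 796)
```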

Allele frequency analysis
Minor allele frequencies (MAFs) among individuals of African and European ancestry for each SNP were obtained from the NCBI ALFA database [27] and Pearson's Chi-Square tests were performed to compare these frequencies with those of the study population. Linkage disequilibrium (LD) between SNPs was analysed using Haploview Version 2 and PLINK Version 1.9 [28,29]. Haplotype blocks of SNPs in strong LD were defined as one or more pairs of SNPs where the 95% CI of the D' value between them has a lower limit ≥0.7 and an upper limit ≥0.98 [30]. SNPs that significantly (p < 0.0001) deviated from Hardy-Weinberg equilibrium (HWE) were excluded. A total of 54 individuals were excluded on the basis of >50% missing genotypes or missing parasitological data; therefore, 796 individuals were included in this analysis.
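The HWE exclusion step can be illustrated with a short sketch. The source does not state which HWE test was applied, so the chi-square goodness-of-fit form below (1 degree of freedom) is an assumption, as are the function names and genotype counts. For 1 degree of freedom the chi-square p-value equals `erfc(sqrt(chi2 / 2))`, which avoids any dependency beyond the standard library.

```python
import math


def allele_freqs(n_AA, n_Aa, n_aa):
    """Frequencies of alleles A and a from genotype counts."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    return p, 1 - p


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square test (1 df) of observed genotype counts against HWE."""
    n = n_AA + n_Aa + n_aa
    p, q = allele_freqs(n_AA, n_Aa, n_aa)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # survival function, 1 df
    return chi2, p_value


def violates_hwe(n_AA, n_Aa, n_aa, threshold=0.0001):
    """Apply the p < 0.0001 exclusion threshold described in the text."""
    return hwe_chi_square(n_AA, n_Aa, n_aa)[1] < threshold


# Hypothetical counts: one SNP in equilibrium, one with no heterozygotes.
in_equilibrium = violates_hwe(25, 50, 25)
out_of_equilibrium = violates_hwe(50, 0, 50)
```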

Relating genotype of single cytokines to S. haematobium infection status and cytokine levels
Pearson's Chi-Square tests were performed using Haploview 2 to test for significant differences in the frequencies of alleles and haplotypes between schistosome-positive and schistosome-negative individuals. Binary logistic regression was conducted using the genotype of SNPs as predictors of infection while controlling for participant sex, village, and age. This analysis was performed using genotypic (AA vs Aa, AA vs aa), dominant (AA vs Aa + aa), and recessive (AA + Aa vs aa) genetic models to further examine these associations (where A = reference allele, a = minor allele) [31]. Individual SNPs were also related to both systemic and parasite-specific cytokine (IL-4, IL-5, IL-10, IL-13, IFNγ) concentrations. Transformed cytokine concentrations were subject to analysis of variance (ANOVA) (sequential sums of squares) and the effect of each SNP was measured following adjustment for the confounding variables of sex, age, and transformed infection intensity. Following this, significant overall effects were further analysed by post-hoc pairwise comparisons, adjusted by Bonferroni correction to account for multiple testing. Some relationships could not be reliably tested due to small sample sizes (N < 10) arising from infrequent genotypes, and thus were not included.
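The three genetic models differ only in how a genotype is coded as a regression predictor. A minimal sketch of that coding follows; the colon-separated genotype strings mirror the notation used in the Results, while the function names are hypothetical.

```python
def minor_allele_count(genotype, minor):
    """Number of copies of the minor allele in a genotype such as 'C:T'."""
    return genotype.split(":").count(minor)


def encode(genotype, minor, model):
    """Code a genotype as predictor(s) under a given genetic model.

    genotypic -> two indicators (Aa vs AA, aa vs AA)
    dominant  -> 1 if at least one minor allele (Aa + aa vs AA)
    recessive -> 1 if homozygous for the minor allele (aa vs AA + Aa)
    """
    n = minor_allele_count(genotype, minor)
    if model == "genotypic":
        return (int(n == 1), int(n == 2))
    if model == "dominant":
        return int(n >= 1)
    if model == "recessive":
        return int(n == 2)
    raise ValueError(f"unknown model: {model}")


# Example with T as the minor allele (a) and C as the reference allele (A).
genotypes = ["C:C", "C:T", "T:T"]
dominant = [encode(g, "T", "dominant") for g in genotypes]
recessive = [encode(g, "T", "recessive") for g in genotypes]
genotypic = [encode(g, "T", "genotypic") for g in genotypes]
```

Under this coding the reference genotype AA is always the all-zero row, so each fitted coefficient is a log odds ratio relative to AA (or relative to AA + Aa in the recessive model).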

Relating SNP principal components to infection and cytokine levels
Genotypes of all SNPs were subject to principal component analysis (PCA). Principal components (PCs) with an eigenvalue >1 were retained, and a SNP was assigned to a PC if its factor loading was ≥0.5 or ≤−0.5; if all of a SNP's loadings fell between −0.5 and 0.5, its highest loading was used. Scores for each PC for each individual were extracted using the regression method. Binary logistic regression and multiple linear regression were utilised to predict infection status and intensity, respectively, adjusting for participant village, sex, and age before PC scores were entered stepwise as predictors. Cytokine concentrations were also related to PCs through multiple linear regression. Systemic and parasite-specific IL-4, IL-5, IL-10, IL-13 and IFNγ concentrations were entered as dependent variables into regression models including participant sex, age, and transformed infection intensity as confounders, before PCs were entered stepwise to identify significant relationships.
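The PC retention and SNP assignment rules can be sketched as below. This illustrates the selection logic only and assumes the eigenvalues and loadings have already been produced by a PCA; the function name, SNP labels, and numeric values are hypothetical, and "highest loading" is interpreted here as largest in absolute value.

```python
def assign_snps_to_pcs(eigenvalues, loadings, threshold=0.5):
    """Assign SNPs to retained PCs.

    eigenvalues: list with one eigenvalue per PC.
    loadings: dict mapping SNP id -> list of loadings, one per PC.
    A PC is retained if its eigenvalue exceeds 1. A SNP joins every retained
    PC on which |loading| >= threshold; if it has none, it joins the retained
    PC on which its absolute loading is largest.
    """
    kept = [i for i, ev in enumerate(eigenvalues) if ev > 1.0]
    assignment = {i: [] for i in kept}
    for snp, row in loadings.items():
        strong = [i for i in kept if abs(row[i]) >= threshold]
        if not strong and kept:
            strong = [max(kept, key=lambda i: abs(row[i]))]
        for i in strong:
            assignment[i].append(snp)
    return assignment


# Hypothetical PCA output for three SNPs and three components.
eigenvalues = [2.1, 1.3, 0.8]
loadings = {
    "snp_a": [0.8, 0.1, 0.3],
    "snp_b": [0.2, -0.6, 0.1],
    "snp_c": [0.3, 0.4, 0.9],  # no retained loading reaches 0.5
}
assignment = assign_snps_to_pcs(eigenvalues, loadings)
```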

Regression analysis of S. haematobium infection
Corroborating the previous analysis, IL4 rs2070874*T and IFNG rs2069727*G were significantly associated with infection in a logistic regression model after adjusting for the confounders of age, sex and village. Individuals with the IL4 rs2070874 genotypes C:T or T:T had ORs of 0.566 (95% CIs: 0.375-0.853, p = 0.0066) and 0.616 (95% CIs: 0.425-0.893, p = 0.010), respectively, relative to the C:C genotype (Fig 3A). Additionally, in a dominant model, individuals carrying at least one copy of IL4 rs2070874*T had an OR of 0.597 (95% CIs: 0.421-0.848, p = 0.0021) relative to the C:C genotype. Under a recessive model, no significant differences were found when comparing individuals carrying at least one copy of the C allele to those homozygous for the T allele. Secondly, individuals with the IFNG rs2069727 genotype G:A had an OR of 1.743 (95% CIs: 1.255-2.421, p = 0.0009) relative to individuals with the A:A genotype (Fig 3B). Individuals with the G:G genotype were not found to have a significant OR for egg positivity relative to individuals with the A:A genotype; however, the G:G genotype had a frequency of 0.019 (N = 16), thus this analysis lacks power. In a dominant model, individuals carrying at least one copy of IFNG rs2069727*G had a combined OR of 1.692 (95% CIs: 1.229-2.33, p = 0.0013), relative to individuals with the A:A genotype. In addition, under a recessive model, no significant differences were found.
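A quick internal-consistency check on these estimates: if the intervals are Wald-type (symmetric on the log-odds scale, which is an assumption here, since the source does not state the interval method), each OR should equal the geometric mean of its confidence limits, and the standard error of ln(OR) can be recovered from the interval width. The sketch below uses only the values reported above; the function name is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Implied OR and SE of ln(OR) for a CI symmetric on the log scale."""
    implied_or = math.sqrt(lower * upper)  # exp of the midpoint of the logs
    implied_se = (math.log(upper) - math.log(lower)) / (2 * z)
    return implied_or, implied_se


# (reported OR, lower 95% limit, upper 95% limit), as given in the text.
reported = {
    "IL4 rs2070874 C:T vs C:C": (0.566, 0.375, 0.853),
    "IL4 rs2070874 T:T vs C:C": (0.616, 0.425, 0.893),
    "IL4 rs2070874 dominant": (0.597, 0.421, 0.848),
    "IFNG rs2069727 G:A vs A:A": (1.743, 1.255, 2.421),
    "IFNG rs2069727 dominant": (1.692, 1.229, 2.33),
}

# Absolute gap between each reported OR and the geometric mean of its limits.
gaps = {
    label: abs(log_scale_summary(lo, hi)[0] - odds_ratio)
    for label, (odds_ratio, lo, hi) in reported.items()
}
```

A gap no larger than rounding error for every row indicates that the reported ORs and intervals are mutually consistent under the Wald assumption.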

Analysis of cytokine production
A subgroup of participants (N = 233) resident in Magaya were further investigated to study cytokine responses. Among this subgroup, the prevalence of S. haematobium infection was 52.586%, and the mean egg count was 29.997 eggs/10 ml urine (SD 87.298). Participants were grouped on the basis of SNP genotype and not infection status, and therefore infection status-dependent effects were not examined. SNPs were investigated to analyse effects on systemic and parasite-specific cytokine concentrations using ANOVA to compare genotypes (Fig 4 and Table 5). This indicated six significant relationships between SNPs and cytokine levels following adjustment for sex, age, and infection. Firstly, IL13 rs20541 was significantly associated with systemic IL-5 concentrations (F = 4.318, p = 0.015), whereby individuals with the A:A genotype had a higher mean IL-5 concentration than individuals with the A:G and G:G genotypes; however, these comparisons were not significant following Bonferroni post-hoc analysis. FOXP3 rs2232365 was also significantly associated with systemic IL-5 concentrations (F = 3.382, p = 0.037), and post-hoc analysis found that individuals with the A:A genotype had a significantly higher mean IL-5 concentration compared to individuals with the G:G genotype (p = 0.0071). Systemic IL-10 was significantly associated with the FOXP3 SNP rs2294021 (F = 3.315, p = 0.0039), and post-hoc analysis indicated that individuals with the T:T genotype had a significantly lower mean IL-10 concentration compared to individuals with the T:C genotype (p = 0.032) but not those with the C:C genotype.
Parasite-specific cytokine levels were also influenced by SNP genotypes. CAP-specific IL-4 was significantly associated with the TBX21 SNP rs16947078 (F = 4.763, p = 0.037), whereby individuals with the A:A genotype had a significantly higher mean concentration compared to individuals of the A:G genotype (note: CAP-specific IL-4 was not measured in any individuals with the G:G genotype). Additionally, SEA-specific IFNγ was associated with both GATA3 rs4143094 and STAT6 rs324015. Firstly, GATA3 rs4143094 was significantly associated with SEA-specific IFNγ (F = 4.212, p = 0.017), with a trend towards lower mean concentrations associated with the G allele; however, post-hoc analysis did not indicate any significant pairwise comparisons between genotypes. Lastly, SEA-specific IFNγ was significantly associated with STAT6 rs324015 (F = 4.857, p = 0.0092), and post-hoc analyses indicated that individuals with the A:A genotype had a significantly higher mean SEA-specific IFNγ concentration compared to individuals with the G:G genotype (p = 0.046).

Discussion
Cytokines are crucial for the type of immune response mounted against pathogens, and their expression is partially controlled by TFs. Here, we analysed the frequency of SNPs in genes encoding cytokine markers of TH1, TH2 and Treg responses and their associated TFs. We related the presence of these SNPs to the risk of schistosome infection as well as levels of systemic and schistosome-specific cytokines. Our most significant findings were that IL4 rs2070874*T is significantly associated with a reduction in schistosome infection risk, while IFNG rs2069727*G and the haplotype GGT across IFNG SNPs rs2069727, rs2069718 and rs2069705 were associated with increased schistosome infection risk. For both IL4 rs2070874*T and IFNG rs2069727*G, it is apparent from these results that carriage of one allele is sufficient to elicit the associated phenotype. To our knowledge, this is the first time either SNP has been associated with susceptibility to schistosomiasis. The protective IL4 rs2070874*T allele had a frequency of 0.542 within the Zimbabwean sample, which is significantly higher than in both African and European populations, in which the allele has frequencies of 0.397 and 0.143, respectively. In addition, the risk IFNG rs2069727*G allele had a frequency of 0.153 within the Zimbabwean sample, which is significantly lower than in both African and European populations, in which the allele has frequencies of 0.212 and 0.468, respectively. Therefore, the protective and risk alleles described here are more and less frequent, respectively, among the study population compared to individuals of both African and European descent.
Of the 35 SNPs under investigation, we found that 65.71% and 91.43% had MAFs that were significantly different between the study sample and African and European populations, respectively. Such high levels of difference between the study population and Europeans are unsurprising. It is equally unsurprising that the study population displayed strong differences to aggregated African populations, as ALFA combines population genetic data across the entire continent of Africa, where there are considerable between- and within-subpopulation genetic differences [101]. By comparing MAFs within the Zimbabwean sample to Africans and Europeans, increased frequencies of alleles associated with elevated TH2 function were found, particularly compared to Europeans. For example, IL4 rs2243250*T had a frequency of 0.772 within the study sample, compared to 0.642 and 0.144 among Africans and Europeans, respectively, and this allele has previously been associated with the upregulation of TH2 responses including increased IL-4 and immunoglobulin-E (IgE) production [60,63]. Additionally, IL13 rs1295686*G, which has frequencies of 0.225, 0.375 and 0.796 among the Zimbabwean sample, Africans and Europeans, respectively, has been associated with decreased risk of asthma and reduced IgE expression [47].

Table 4. Case-control analysis of haplotype frequencies between schistosome-infected (N = 354) and schistosome-uninfected (N = 442) individuals.
The IL4 rs2070874*T allele was found here to be protective against S. haematobium and significantly more frequent within the study sample. IL4 rs2070874 is located within the 5' untranslated region (UTR) of the IL4 gene. The 5' UTR of genes is associated with controlling translation efficiency through the binding of TFs and RNA polymerase, and the formation of the ribosomal initiation complex [102,103]. This raises the possibility that IL4 rs2070874*T may influence the translation of IL4, and while not observed in this present study, IL4 rs2070874*T has previously been associated with elevated levels of plasma IL-4 [68]. We also identified a novel association between the IFNG gene and schistosomiasis susceptibility, as IFNG rs2069727*G was found here to be associated with increased risk of infection. As with IL4 rs2070874, IFNG rs2069727 is not located within a coding region, as it is found approximately 500 bp downstream of the IFNG gene. It is possible that this polymorphism similarly affects the binding of regulatory factors and as such modulates the translation of the IFNG gene [103], and though not replicated here, IFNG rs2069727*G has previously been associated with altered IFNγ production [84,85]. Further mechanistic investigations are required to fully elucidate the nature of these polymorphisms and their functional impacts; however, given that they are both located outside of coding regions, modulation of the regulation of gene expression is a leading hypothesis on the biomolecular consequences of these polymorphisms. In the absence of a mechanistic explanation and without evidence here of associations with cytokine production, it is difficult to draw conclusions on how these SNPs fit into the immunological response to schistosomes and the development of protective immunity. Given each SNP's identified association with susceptibility to infection, it may be that the biological consequences of these polymorphisms lie in the innate immune response generated following infection. Research by this group and others has found the early immune response to egg deposition relies on both TH1 and TH2 cytokines, including both IL-4 and IFNγ, as well as cellular elements including alternatively-activated macrophages, monocytes, and innate lymphoid cells, and that these elements both rely on and amplify cytokine responses [3,104-107]. Therefore, it could be that alterations to innate cytokine responses at early stages of infection are influenced by a disruption to the TH1/TH2 balance arising from these SNPs; however, without further mechanistic evidence, this remains speculative.
Several other SNPs have been identified as influencing risk of schistosome infection, including within the IL4 gene, as Adedokun and colleagues identified an association between IL4 rs2243250 and increased risk of S. haematobium in Nigerian children [21]. IL4 rs2243250 was in violation of HWE in the present study and as such case-control analysis of S. haematobium infection was not performed for this SNP. Ellis and colleagues conducted a similar genetic association study where they found no association between IL4 rs2070874 and risk of infection with S. japonicum in a Chinese population [17]. However, comparability between that study and ours is limited given the differences in population, underlying genetic linkage structures, and Schistosoma species. This present study is, to the best of our knowledge, the first genetic association study to examine IL4 rs2070874 in the context of S. haematobium. This study is also, to the best of our knowledge, the first to support a genetic association between IFNG rs2069727 and infection with Schistosoma, and the first to identify an association between schistosomiasis risk and the IFNG gene.
Due to our finding that IFNG rs2069727 is in LD with both IFNG rs2069705 and IFNG rs2069718, it is difficult to discern the associations with IFNG rs2069727 from either of these SNPs. However, given the low r 2 values between IFNG rs2069727, IFNG rs2069718, and IFNG rs2069705, and the lack of any significant independent association with IFNG rs2069705 or IFNG rs2069718 alone, the risk allele and associated phenotype is likely to be more closely associated with IFNG rs2069727. Previously, IFNG rs2069727 � G has been associated with elevated IFNγ concentrations [84] and while no association was found here with IFNγ concentrations, it is plausible that an associated disturbance to the T H 1/T H 2 balance may underlie the increased risk of schistosomiasis. Therefore, as with IL4 rs2070874, further examination of the biological effects of these SNPs would prove beneficial.  One SNP included in our case-control analysis (IL13 rs20541) had previously been associated with S. mansoni and S. japonicum infection, though these findings were not replicated here [20,48]. There is an apparent lack of reproducibility and generalisability in these associations between populations and Schistosoma species, the former of which may be due to differences in LD structures. In addition, it is widely acknowledged that disease susceptibility is mostly the product of whole-genome variation, rather than particularly deleterious or advantageous individual alleles [108]. Therefore, associations being made with individual SNPs or with a limited range of alleles suffer from limited biological relevance. The ability to capture variation across the entire genome would be beneficial in providing a more comprehensive analysis. Our understanding of the complex role of genes in determining susceptibility to schistosome infection is hindered by a lack of GWASs-to date, no published study has performed a GWAS on schistosomiasis in humans. 
The adoption of such techniques would broaden our knowledge of the role of host genetics in schistosomiasis, other helminthiases, and neglected tropical diseases. The analysis conducted here identified relationships between SNPs and levels of both systemic and parasite-specific cytokine responses. This included the FOXP3 SNP rs2232365, which was found to be significantly associated with lower systemic IL-5 both individually and when combined with FOXP3 SNPs rs11091253 and rs2294021 in PCA-based regression analysis. FOXP3, the master regulatory TF of Treg responses, is responsible also for downregulating T H 1 and T H 2 immune responses, and FOXP3 rs2232365 has been associated with higher FOXP3 expression [94], potentially underlying the observed decreased IL-5 concentrations. In addition, the STAT6 SNP rs324015 was associated with reduced SEA-specific IFNγ when comparing genotypes, and this SNP has been previously associated with reduced asthma risk and reduced IgE [81,82], thereby indicating that this polymorphism results in a dampening of T H 2 responses. Our observation is therefore in accordance with these previous findings. While none of the SNPs identified as being associated with cytokine concentrations were also associated with susceptibility to schistosomiasis, it would be of value to examine whether these SNPs are associated with changes in T H 2-mediated immunopathology. For example, the IL13 promoter polymorphism rs1800925 (not studied here) has previously been associated with both elevated IL-13 expression and an increase in liver fibrosis associated with S. japonicum infection [48]. 
The observation made here that STAT6 rs324015 was associated with elevated schistosome egg antigen-specific IFNγ may have implications for early immune responses, such as reducing TH2-mediated immunopathology following egg deposition. Examination of the immunological consequences of this SNP and others on responses to schistosomiasis would shed further light on the role of host genetics in S. haematobium infection.
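A genotype comparison of the kind described above for STAT6 rs324015 can be sketched as a one-way analysis of variance across the three genotype groups. The example below is a minimal illustration only: the function name `one_way_anova_f`, the genotype labels, and all cytokine values are hypothetical and are not data from this study, and the sketch omits the covariate adjustment and p-value calculation that a full analysis would require.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups.

    F is the ratio of the between-group mean square to the
    within-group mean square.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups and n > k")

    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical SEA-specific IFN-gamma levels (arbitrary units) by genotype
ifng_by_genotype = {
    "AA": [12.0, 15.5, 11.2, 14.1],
    "AG": [16.3, 18.0, 15.1, 17.4],
    "GG": [19.9, 21.2, 18.5, 22.0],
}
f_stat, df_b, df_w = one_way_anova_f(list(ifng_by_genotype.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F statistic would then be compared against the F distribution with the reported degrees of freedom to obtain a p-value, for instance with a statistics library.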
Those SNPs identified here as being significantly associated with S. haematobium susceptibility were not individually associated with levels of any systemic or parasite-specific cytokines, despite having previously been associated with expression of their respective cytokines [68,84]. Several reasons may explain this discrepancy, including differences in linkage structures between genes in this study population and those in the study populations of previous research. For example, neither of the papers that previously associated IL4 rs2070874*T or IFNG rs2069727*G with expression of its respective cytokine studied individuals of African heritage. Individuals of African heritage are known to possess genetic linkage structures significantly different from those of European and other ancestries [109]; a genetic association found in a non-African population would therefore not necessarily be replicated in an African population. A weak but significant relationship was identified between the PC representing IL4 SNPs rs2070874, rs2243248 and rs2243250 and CAP-specific IL-10. However, none of these variants was individually associated with this cytokine, so it is not possible to deduce which is most likely to be the causative allele of this weak effect. Thus, these results do not provide evidence from which to hypothesise the mechanism linking infection with S. haematobium to the variants identified as risk and protective alleles.
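The effect of population-specific linkage structure can be illustrated with the standard r² measure of LD between two biallelic loci, r² = D² / (pA(1 − pA) pB(1 − pB)), where D = pAB − pA·pB. The sketch below is illustrative only: the function name `ld_r2` and all allele and haplotype frequencies are hypothetical and do not correspond to any SNP pair or population analysed in this study.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    # The haplotype frequency must be compatible with the allele frequencies
    if not (max(0.0, p_a + p_b - 1) <= p_ab <= min(p_a, p_b)):
        raise ValueError("haplotype frequency inconsistent with allele frequencies")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical example: the same two SNPs with identical allele frequencies
# but different haplotype frequencies in two populations.
r2_tagging = ld_r2(p_ab=0.30, p_a=0.40, p_b=0.35)  # alleles often co-inherited
r2_weak = ld_r2(p_ab=0.16, p_a=0.40, p_b=0.35)     # alleles nearly independent
print(f"population 1 r^2 = {r2_tagging:.3f}")
print(f"population 2 r^2 = {r2_weak:.3f}")
```

Under these assumed frequencies, a genotyped SNP that tracks a causal variant well in the first population would be a poor proxy for it in the second, which is one way an association can fail to replicate across ancestries.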
This study focusses on an underrepresented group among genetic association studies, as most population-level genetic research focusses on individuals of European ancestry [24]. In addition, the genetic analysis of infectious disease susceptibility is an underdeveloped field relative to that of other diseases, including metabolic diseases and cancer [110]. As such, this study provides novel analyses and findings on both an underrepresented population and an underrepresented disease. Genetic studies of individuals of African ancestry are particularly important in infectious diseases given the plethora of endemic diseases found on the continent and the unique genetic background against which these occur. Population genetics is becoming increasingly recognised as an important modulator of infectious diseases, influencing susceptibility and disease severity [24]. Expanding analysis of the genetic basis of infectious disease susceptibility beyond populations of European ancestry helps in understanding how such diseases differentially affect populations of different heritage and how interventions can best be designed to account for this. For example, the SNP rs12979860 in the IL28B gene encoding type III IFN-λ-3 has been found to associate with improved clearance and response to treatment in hepatitis C virus infection; however, this polymorphism is vastly more common among individuals of Asian and European ancestry than among those of African ancestry, and this difference in host genetics is thought to partially underlie disparities in hepatitis C virus infection outcomes between African-Americans and European descendants [111,112]. Furthermore, genetics has been suggested as an underlying factor in the higher frequencies of allergic and atopic diseases observed among individuals of African heritage compared to those of European heritage [113,114].
Host genetics has also been hypothesised to be an underlying factor in the way in which the SARS-CoV-2 pandemic has manifested in sub-Saharan Africa, where substantially lower morbidity and mortality have occurred compared to European and North American countries [115,116]. The contextualisation of genetic associations with population-level allele frequencies is of additional benefit: in this study, the novel protective and risk alleles were found to have higher and lower frequencies, respectively, in the study population relative to European populations. The frequencies of a range of immune system polymorphisms reported in this study are valuable as a contribution to the overall understanding of the immunogenetics of both Zimbabweans and individuals of sub-Saharan African descent. Such understanding and continued research may in future contribute to the use of population immunogenetics in the design and implementation of interventions against a range of infectious diseases.
The study presented here benefits from a number of strengths. It focusses on an underrepresented population and disease, thereby filling a gap in research into host genetic susceptibility. Furthermore, the sample size allowed the analysis of rarer alleles and the characterisation of the frequency of these SNPs within the population. However, there remain a number of limitations to the work described here. The exclusion of individuals with Plasmodium, HIV or STH infection will, inevitably, have excluded a significant number of individuals and a particular demographic from participation. Some estimates have suggested that the prevalence of co-infection with Plasmodium and Schistosoma in some regions of sub-Saharan Africa may be as high as 30% [117]; however, the exclusion of co-infections from this study was necessary in order to control for potentially confounding concurrent immunological responses to Plasmodium infection. Additionally, although urinary schistosomiasis increases HIV risk, meaning that HIV-infected individuals may represent a significant proportion of all schistosome-infected individuals [118], prospective participants found to be infected with HIV were excluded to remove the confounding effects of the immunosuppressive nature of HIV infection. An additional limitation of this study is the unequal age distribution, in that the median age skews significantly young. While age was adjusted for in statistical modelling, it remains to be seen whether adult age-related effects exist within the results described here.
In summary, here we report on the frequency of SNPs within cytokine and TF genes and describe differences between the Zimbabwean study sample and African and European populations. In addition, we identify novel dominant protective and risk alleles at IL4 rs2070874*T and IFNG rs2069727*G, respectively, for urogenital schistosomiasis and significantly associate the IFNG gene with schistosomiasis susceptibility for the first time. These findings add to the growing understanding of the role of genetic variation in schistosomiasis, emphasise the duality of TH responses against schistosomes, and indicate important points of future investigation that may reveal more about the mechanisms of the host immune response to schistosome infection. They also show that genetic elements associated with elevated TH2 reactivity are more frequent in the study sample, contributing to a developing understanding of immunogenetics among individuals of African ancestry and highlighting the need to improve the understanding of population-specific immunogenetics in the context of schistosomiasis, helminth infections, and neglected tropical diseases more widely.